Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications

Authors

  • Tian Lin
  • Bruno D. Abrahao
  • Robert D. Kleinberg
  • John Lui
  • Wei Chen
Abstract

In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, infinite outcome space of the environment and exponentially large action space of the player. We present the Global Confidence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and finite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves $O(T^{2/3} \log T)$ distribution-independent regret and $O(\log T)$ distribution-dependent regret (the latter assuming a unique optimal action), where $T$ is the total number of time steps played. Moreover, the regret bounds only depend linearly on $\log |X|$ rather than $|X|$, where $X$ is the action space. GCB isolates offline optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efficient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.

* The work was done while the author was an intern at Microsoft Research. Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).


Similar resources

Game Theory based Energy Efficient Hybrid MAC Protocol for Lifetime Enhancement of Wireless Sensor Network

Wireless Sensor Networks (WSNs) comprising tiny, power-constrained nodes are becoming very popular due to their potential uses in a wide range of applications, such as environmental monitoring and various military and civilian applications. The critical issue for a node is energy consumption, since it is battery-operated; its lifetime should therefore be maximized for effective utilization ...


Partial Eigenvalue Assignment in Discrete-time Descriptor Systems via Derivative State Feedback

This work focuses on a method for solving descriptor discrete-time linear systems. For simplicity, the system is converted to a standard discrete-time linear system by defining a derivative state feedback. Partial eigenvalue assignment is then used to obtain the state feedback and solve the standard system. In partial eigenvalue assignment, just a part of the open loop spectrum of the standard linear s...


Some Results about the Contractions and the Pendant Pairs of a Submodular System

Submodularity is an important property of set functions with deep theoretical results and various applications. Submodular systems appear in many application areas, for example machine learning, economics, computer vision, social science, game theory and combinatorial optimization. Nowadays, submodular function optimization has attracted many researchers. Pendant pairs of a symmetric...


Multiple attribute decision making with triangular intuitionistic fuzzy numbers based on zero-sum game approach

For many decision problems with uncertainty, the triangular intuitionistic fuzzy number is a useful tool for expressing ill-known quantities. This paper develops a novel decision method based on a zero-sum game for multiple attribute decision making problems where the attribute values take the form of triangular intuitionistic fuzzy numbers and the attribute weights are unknown. First, a new value ind...


Primal-Dual Combinatorial Algorithms

Linear programming and its duality have long been ubiquitous tools for analyzing NP-hard problems and designing fast approximation algorithms. Plotkin et al. proposed a primal-dual combinatorial algorithm based on linear duality for fractional packing and covering, which achieves significant speedup on a wide range of problems including multicommodity flow. The key ideas there are: 1) design a primal...




Publication date: 2014